28 research outputs found

    New technologies for high throughput genetic analysis of complex genomes

    High throughput sequencing can generate hundreds of millions of reads in a single day and is revolutionizing modern genetics. This project aimed to utilize next generation genetic approaches to analyze non-model but agronomically important plant species. A key feature of these species is their complexity. Mapping and SNP calling of these sequencing datasets is fundamental to many downstream analyses that have been implemented here, including mutant identification, comparative analyses between related organisms, and epigenetic studies. The first objective in this project involved developing accelerated mutant identification techniques using mapping-by-sequencing analyses that combine whole genome sequencing with genetic mapping. Such methods have largely required a complete reference sequence and are typically implemented on a mapping population with a common mutant phenotype of interest. Here mutant identification was demonstrated on the model diploid plant Arabidopsis thaliana as a proof of principle of the methodology. It was also demonstrated on a simulated hexaploid mutant that was developed using the Arabidopsis reference genome. In species such as wheat, no finished genome reference sequence is available and, due to its large genome size (17 Gb), re-sequencing at sufficient depth of coverage is not practical. Therefore a genomic target enrichment approach was validated and used here to capture the gene rich regions of hexaploid bread wheat, reducing the sequencing cost while still allowing analysis of the majority of wheat’s genic sequence. A pseudo-chromosome based reference sequence was developed from this genic sequence with a long-range order of genes based on synteny of wheat with Brachypodium distachyon.
Using the capture probe set for target enrichment followed by next generation sequencing, an early flowering locus was mapped in the diploid wheat Triticum monococcum, and the stripe rust resistance gene was located in hexaploid bread wheat Triticum aestivum. A bespoke pipeline and algorithm were developed for mutant loci identification and the pseudo-chromosome reference was implemented. This novel method will allow widespread application of sliding window mapping-by-sequencing analyses to datasets that are enriched, lack a finished reference genome, or are polyploid. The second main objective of this project involved a study of methylation patterns in wheat utilizing sodium bisulfite treatment, combined with target enrichment. An enrichment system was specifically designed, developed, validated and implemented here to perform one of the first studies of methylation patterns in hexaploid bread wheat across the three genomes that used a genome-wide subset of genes and can thus be used to infer genome-wide methylation patterns and observations. This investigation confirmed that differential methylation exists between the A, B and D genomes of wheat and that temperature is capable of altering methylation states.

    A framework for gene mapping in wheat demonstrated using the Yr7 yellow rust resistance gene

    We used three approaches to map the yellow rust resistance gene Yr7 and identify associated SNPs in wheat. First, we used a traditional QTL mapping approach using a doubled haploid (DH) population and mapped Yr7 to a low-recombination region of chromosome 2B. To fine map the QTL, we then used an association mapping panel. Both populations were SNP array genotyped, allowing alignment of QTL and genome-wide association scans based on common segregating SNPs. Analysis of the association panel spanning the QTL interval narrowed the interval down to a single haplotype block. Finally, we used mapping-by-sequencing of resistant and susceptible DH bulks to identify a candidate gene in the interval showing high homology to a previously suggested Yr7 candidate and to populate the Yr7 interval with a higher density of polymorphisms. We highlight the power of combining mapping-by-sequencing, which delivers a complete list of gene-based segregating polymorphisms in the interval, with the high-recombination, low-LD precision of the association mapping panel. Our mapping-by-sequencing methodology is applicable to any trait and our results validate the approach in wheat, where, with a near-complete reference genome sequence, we are able to define a small interval containing the causative gene.

    Using genic sequence capture in combination with a syntenic pseudo genome to map a deletion mutant in a wheat species

    Mapping‐by‐sequencing analyses have largely required a complete reference sequence and employed whole genome re‐sequencing. In species such as wheat, no finished genome reference sequence is available. Additionally, because of its large genome size (17 Gb), re‐sequencing at sufficient depth of coverage is not practical. Here, we extend the utility of mapping by sequencing, developing a bespoke pipeline and algorithm to map an early‐flowering locus in einkorn wheat (Triticum monococcum L.) that is closely related to the bread wheat genome A progenitor. We have developed a genomic enrichment approach using the gene‐rich regions of hexaploid bread wheat to design a 110‐Mbp NimbleGen SeqCap EZ in solution capture probe set, representing the majority of genes in wheat. Here, we use the capture probe set to enrich and sequence an F2 mapping population of the mutant. The mutant locus was identified in T. monococcum, which lacks a complete genome reference sequence, by mapping the enriched data set onto pseudo‐chromosomes derived from the capture probe target sequence, with a long‐range order of genes based on synteny of wheat with Brachypodium distachyon. Using this approach we are able to map the region and identify a set of deleted genes within the interval

    A genome-wide survey of DNA methylation in hexaploid wheat

    BACKGROUND: DNA methylation is an important mechanism of epigenetic gene expression control that can be passed between generations. Here, we use sodium bisulfite treatment and targeted gene enrichment to study genome-wide methylation across the three sub-genomes of allohexaploid wheat. RESULTS: While the majority of methylation is conserved across all three genomes we demonstrate that differential methylation exists between the sub-genomes in approximately equal proportions. We correlate sub-genome-specific promoter methylation with decreased expression levels and show that altered growing temperature has a small effect on methylation state, identifying a small but functionally relevant set of methylated genes. Finally, we demonstrate long-term methylation maintenance using a comparison between the D sub-genome of hexaploid wheat and its progenitor Aegilops tauschii. CONCLUSIONS: We show that tri-genome methylation is highly conserved with the diploid wheat progenitor while sub-genome-specific methylation shows more variation

    Mapping-by-sequencing in complex polyploid genomes using genic sequence capture: a case study to map yellow rust resistance in hexaploid wheat

    Previously we extended the utility of mapping-by-sequencing by combining it with sequence capture and mapping sequence data to pseudo-chromosomes that were organized using wheat-Brachypodium synteny. This, with a bespoke haplotyping algorithm, enabled us to map the flowering time locus in the diploid wheat Triticum monococcum L., identifying a set of deleted genes (Gardiner et al., 2014). Here, we develop this combination of gene enrichment and sliding window mapping-by-synteny analysis to map the Yr6 locus for yellow stripe rust resistance in hexaploid wheat. A 110-Mbp NimbleGen capture probe set was used to enrich and sequence a doubled-haploid mapping population of hexaploid wheat derived from an Avalon and Cadenza cross. The Yr6 locus was identified by mapping to the POPSEQ chromosomal pseudomolecules using a bespoke pipeline and algorithm (Chapman et al., 2015). Furthermore, the same locus was identified using newly developed pseudo-chromosome sequences as a mapping reference that are based on the genic sequence used for sequence enrichment. The pseudo-chromosomes allow us to demonstrate the application of mapping-by-sequencing to even poorly defined polyploid genomes where chromosomes are incomplete and sub-genome assemblies are collapsed. This analysis uniquely enabled us to: compare wheat genome annotations; identify the Yr6 locus, defining a smaller genic region than was previously possible; associate the interval with one wheat sub-genome; and increase the density of associated SNP markers. Finally, we built the pipeline in iPlant, making it a user-friendly community resource for phenotype mapping.
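
    The sliding window idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' bespoke pipeline or algorithm, which the abstract does not describe in detail; it is a generic illustration that assumes each SNP carries an alternate-allele frequency measured in a bulk, and that the window with the highest mean frequency marks the candidate interval. The function name, the input layout, and the window and step sizes are all hypothetical.

```python
from typing import List, Optional, Tuple


def sliding_window_peak(
    snps: List[Tuple[int, float]], window: int, step: int
) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, mean_freq) for the window with the highest mean
    alternate-allele frequency, or None if no SNPs are supplied.

    snps   -- (position, allele_frequency) pairs sorted by position
              (hypothetical input layout, assumed for illustration)
    window -- window width in base pairs
    step   -- distance between consecutive window starts in base pairs
    """
    if not snps:
        return None
    best: Optional[Tuple[int, int, float]] = None
    start = snps[0][0]
    last = snps[-1][0]
    while start <= last:
        end = start + window
        # Collect frequencies of SNPs falling in the half-open window [start, end).
        freqs = [freq for pos, freq in snps if start <= pos < end]
        if freqs:
            mean = sum(freqs) / len(freqs)
            # Keep the first window that reaches the highest mean so far.
            if best is None or mean > best[2]:
                best = (start, end, mean)
        start += step
    return best
```

    In this simplified picture, SNPs linked to the locus of interest approach fixation in the selected bulk, so the peak window is the candidate interval; a real analysis would also need to handle read depth, homoeologous SNPs and reference ordering.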

    Analysis of the recombination landscape of hexaploid bread wheat reveals genes controlling recombination and gene conversion frequency

    Background: Sequence exchange between homologous chromosomes through crossing over and gene conversion is highly conserved among eukaryotes, contributing to genome stability and genetic diversity. A lack of recombination limits breeding efforts in crops; therefore, increasing recombination rates can reduce linkage drag and generate new genetic combinations. Results: We use computational analysis of 13 recombinant inbred mapping populations to assess crossover and gene conversion frequency in the hexaploid genome of wheat (Triticum aestivum). We observe that high-frequency crossover sites are shared between populations and that closely related parents lead to populations with more similar crossover patterns. We demonstrate that gene conversion is more prevalent and covers more of the genome in wheat than in other plants, making it a critical process in the generation of new haplotypes, particularly in centromeric regions where crossovers are rare. We identify quantitative trait loci for altered gene conversion and crossover frequency and confirm functionality for a novel RecQ helicase gene that belongs to an ancient clade that is missing in some plant lineages including Arabidopsis. Conclusions: This is the first gene to be demonstrated to be involved in gene conversion in wheat. Harnessing the RecQ helicase has the potential to break linkage drag utilizing widespread gene conversions

    Combining explainable machine learning, demographic and multi-omic data to inform precision medicine strategies for inflammatory bowel disease.

    Inflammatory bowel diseases (IBDs), including ulcerative colitis and Crohn's disease, affect several million individuals worldwide. These diseases are heterogeneous at the clinical, immunological and genetic levels and result from complex host and environmental interactions. Investigating drug efficacy for IBD can improve our understanding of why treatment response can vary between patients. We propose an explainable machine learning (ML) approach that combines bioinformatics and domain insight, to integrate multi-modal data and predict inter-patient variation in drug response. Using explanation of our models, we interpret the ML models' predictions to infer unique combinations of important features associated with pharmacological responses obtained during preclinical testing of drug candidates in ex vivo patient-derived fresh tissues. Our inferred multi-modal features that are predictive of drug efficacy include multi-omic data (genomic and transcriptomic), demographic, medicinal and pharmacological data. Our aim is to understand variation in patient responses before a drug candidate moves forward to clinical trials. As a pharmacological measure of drug efficacy, we measured the reduction in the release of the inflammatory cytokine TNFα from the fresh IBD tissues in the presence/absence of test drugs. We initially explored the effects of a mitogen-activated protein kinase (MAPK) inhibitor; however, we later showed our approach can be applied to other targets, test drugs or mechanisms of interest. Our best model predicted TNFα levels from demographic, medicinal and genomic features with an error of only 4.98% on unseen patients. We incorporated transcriptomic data to validate insights from genomic features. 
Our results showed variations in drug effectiveness (measured by ex vivo assays) between patients that differed in gender, age or condition and linked new genetic polymorphisms to variation in patient response to the anti-inflammatory treatment BIRB796 (Doramapimod). Our approach models IBD drug response while also identifying its most predictive features as part of a transparent ML precision medicine strategy.

    Identification of nitrogen-dependent QTL and underlying genes for root system architecture in hexaploid wheat

    The root system architecture (RSA) of a crop has a profound effect on the uptake of nutrients and consequently the potential yield. However, little is known about the genetic basis of RSA and resource dependent response in wheat (Triticum aestivum L.). Here, a high-throughput hydroponic root phenotyping system was used to identify N-dependent root traits in a wheat mapping population. Using quantitative trait locus (QTL) analysis, a total of 55 QTLs were discovered for seedling root traits across two N treatments, 25 of which were N-dependent. Transcriptomic analyses were used on a N-dependent root angle QTL on chromosome 2D and 17 candidate genes were identified. Of these N-dependent genes a nitrate transporter 1/peptide transporter (NPF) family gene was upregulated making it an interesting candidate for N signalling and response processes for root angle change. The RNA-seq results provide valuable genetic insight for root angle control, N-dependent responses and candidate genes for improvement of N capture in wheat

    Identification of QTL and underlying genes for root system architecture associated with nitrate nutrition in hexaploid wheat

    The root system architecture (RSA) of a crop has a profound effect on the uptake of nutrients and consequently the potential yield. However, little is known about the genetic basis of RSA and resource adaptive responses in wheat (Triticum aestivum L.). Here, a high-throughput germination paper-based plant phenotyping system was used to identify seedling traits in a wheat doubled haploid mapping population, Savannah×Rialto. Significant genotypic and nitrate-N treatment variation was found across the population for seedling traits with distinct trait grouping for root size-related traits and root distribution-related traits. Quantitative trait locus (QTL) analysis identified a total of 59 seedling trait QTLs. Across two nitrate treatments, 27 root QTLs were specific to the nitrate treatment. Transcriptomic analyses for one of the QTLs on chromosome 2D, which was found under low nitrate conditions, revealed gene enrichment in N-related biological processes and 28 differentially expressed genes with possible involvement in a root angle response. Together, these findings provide genetic insight into root system architecture and plant adaptive responses to nitrate, as well as targets that could help improve N capture in wheat